Approximate Frequent Pattern Discovery Over Data Stream
Abstract
Frequent pattern discovery over a data stream is a hard problem because the continuously generated nature of a stream does not allow data elements to be revisited. Furthermore, the pattern discovery process must be fast enough to produce timely results. Based on these requirements, we propose an approximate approach to the problem of discovering frequent patterns over a continuous stream. Our approximation algorithm is intended to be applied to a stream before the pattern discovery process. The results of approximate frequent pattern discovery are reported in the paper.
Keywords: Frequent pattern discovery, Approximate algorithm, Data stream analysis.
Similar resources
A Sliding Window Algorithm for Relational Frequent Patterns Mining from Data Streams
Some challenges in frequent pattern mining from data streams are the drift of data distribution and the computational efficiency. In this work an additional challenge is considered: data streams describe complex objects modeled by multiple database relations. A multi-relational data mining algorithm is proposed to efficiently discover approximate relational frequent patterns over a sliding time...
Displaying Co-occurrences of Patterns in Streams for Website Usage Analysis
One way of getting a better view of data is by using frequent patterns. In this paper, frequent patterns are (sub)sets that occur at least a minimum number of times in a stream of itemsets. However, the discovery of frequent patterns in streams has always been problematic. Because streams are potentially endless, it is harder to say whether a pattern is frequent or not. Furthermore, the number of patterns can ...
Distributed and Stream Data Mining Algorithms for Frequent Pattern Discovery
The use of distributed systems is continuously spreading in several application domains. Extracting valuable knowledge from raw data produced by distributed parties, in order to produce a unified global model, may present various challenges related either to the huge amount of managed data or to their physical location and ownership. In case data are continuously produced (stream) and their anal...
An Efficient Algorithm for Mining Frequent Itemsets over the Entire History of Data Streams
A data stream is a continuous, huge, fast changing, rapid, infinite sequence of data elements. The nature of streaming data makes it essential to use online algorithms which require only one scan over the data for knowledge discovery. In this paper, we propose a new single-pass algorithm, called DSMFI (Data Stream Mining for Frequent Itemsets), to mine all frequent itemsets over the entire hist...
Frequent Episode Mining Over the Latest Window Using Approximate Support Counting
In this paper we propose a streaming approach for the discovery and monitoring of frequent patterns (the episodes) within the recent past of an event stream. This approach is based on the heuristic computation of the estimated support of the episodes of interest, and allows the fast discovery of frequent episodes with limited information storage. In particular, we do not even need to store the ...